57 research outputs found

    Applying Biomedical Ontologies on Semantic Query Expansion

    *1- Introduction*

The interpretation of a question (or information need) depends, among other things, on a series of lexical-semantic relations that complement and support the cognitive process of answering that information need. Despite this, the information retrieval mechanisms currently in use take little advantage of the semantic interpretation of users’ information needs (usually specified through keywords). In most cases, those mechanisms are based on keyword matching and are therefore excessively dependent on the query and document terms.

There are several past results showing that, in general, information retrieval based on domain knowledge decreases the accuracy of keyword-based search engines. We believe this approach deserves further discussion and experimentation, in search of stronger evidence that these negative results can really be generalized. Moreover, previous work leaves some questions unanswered, which our experiment addresses:

(_i_) Would using a scientific ontology with formal construction and maintenance processes, such as the OBO ontologies, produce better results?

(_ii_) Are there more efficient query expansion techniques using available domain knowledge?

(_iii_) Is a scientific ontology complete enough to fulfill the information retrieval researchers’ needs, in general?

*2- Semantic Query Expansion*

To try to answer some of these questions, we ran a query expansion experiment using the Gene Ontology (GO) as domain knowledge. As the document repository, we used an extract of 10 years of PubMed publications (from 1994 to 2004), which contains approximately 4.6 million documents. This dataset is a test collection used by the information retrieval community, called Genomic TREC.
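
As a rough illustration of ontology-based query expansion, the following Python sketch expands each query term with related terms drawn from a small term-to-relations map, restricted to a chosen set of relation types. The toy ontology entries, the function names `expand_query` and `to_boolean_query`, and the Boolean output syntax are assumptions made for this example only; the abstract does not describe the actual expansion procedure or the search engine used.

```python
from typing import Dict, Iterable, List, Tuple

# (relation type, related term)
Relation = Tuple[str, str]

# Hypothetical miniature ontology for illustration. These entries are
# NOT taken from the Gene Ontology release used in the experiment.
TOY_ONTOLOGY: Dict[str, List[Relation]] = {
    "apoptosis": [
        ("synonym", "apoptotic cell death"),
        ("is_a", "programmed cell death"),
    ],
}


def expand_query(
    terms: Iterable[str],
    ontology: Dict[str, List[Relation]],
    allowed: Tuple[str, ...] = ("synonym",),
) -> List[List[str]]:
    """Return one group per query term: the term plus its allowed expansions."""
    groups: List[List[str]] = []
    for term in terms:
        group = [term]
        for relation, related in ontology.get(term.lower(), []):
            # Only follow the relation types the caller asked for.
            if relation in allowed and related not in group:
                group.append(related)
        groups.append(group)
    return groups


def to_boolean_query(groups: List[List[str]]) -> str:
    """Render groups as an AND of ORs (an assumed, generic query syntax)."""
    clauses = []
    for group in groups:
        clause = " OR ".join(f'"{t}"' for t in group)
        clauses.append(f"({clause})" if len(group) > 1 else clause)
    return " AND ".join(clauses)


if __name__ == "__main__":
    synonyms_only = expand_query(["apoptosis", "mouse"], TOY_ONTOLOGY)
    print(to_boolean_query(synonyms_only))
    with_parents = expand_query(
        ["apoptosis", "mouse"], TOY_ONTOLOGY, allowed=("synonym", "is_a")
    )
    print(to_boolean_query(with_parents))
```

Keeping the relation types as a parameter makes it easy to compare synonym-only expansion against expansion that also follows generalization/specialization links.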

*3- Results*
To evaluate our ontology-based semantic query expansion technique, we measured the effectiveness of the information retrieval mechanism with and without expansion. In a nutshell, the average result showed a 28% increase in effectiveness when expanding with synonym relations and a small decrease with other relations.

Our results are largely consistent with past related work. In fact, if the expansion strategy does not selectively choose when and how to expand, only synonym relations are worth using. However, looking further, there are several opportunities to try other expansion strategies. For example, the problem with query expansion using generalization/specialization relationships is that, if it is always applied, bad results are more frequent than good ones. But if the strategy is selective about when to use these relations for expansion, the gain in accuracy can be outstanding: in our experiment, one query showed a 98% increase in effectiveness.

*4- Conclusion*
We strongly believe that it is premature to assume that semantics-based query expansion is, in general, a recall-enhancing, precision-degrading technique. Our experiments suggest that by using science-based ontologies (like the OBO ontologies) with formal relations, it is possible to increase both recall and precision. Our group is currently revising this first experiment towards a better semantic query expansion strategy.

*5- Acknowledgements*
This work was partially funded by CAPES and CNPq research grants 311454/2006-2, 306889/2007-2 and 484713/2007-8.

*References*
_Fox E. Lexical relations enhancing effectiveness of information retrieval systems. SIGIR Forum, New York, v.15, n.3, p.5-3._

_Voorhees E. Query expansion using lexical-semantic relations. In: ACM SIGIR conference on research and development in information retrieval, Proceedings, Dublin:17, p.61–69, 1994._

    PATAXÓ: A Framework to Allow Updates Through XML Views

    XML has become an important medium for data exchange, and is frequently used as an interface to (i.e., a view of) a relational database. Although a lot of work has been done on querying relational databases through XML views, the problem of updating relational databases through XML views has not received much attention. In this work, we map XML views expressed using a subset of XQuery to a corresponding set of relational views. Thus, we transform the problem of updating relational databases through XML views into a classical problem of updating relational databases through relational views. We then show how updates on the XML view are mapped to updates on the corresponding relational views. Existing work on updating relational views can then be leveraged to determine whether or not the relational views are updatable with respect to the relational updates, and if so, to translate the updates to the underlying relational database

    Using XQuery to Build Updatable XML Views Over Relational Databases

    XML has become an important medium for data exchange, and is frequently used as an interface to - i.e. a view of - a relational database. Although much attention has been paid to the problem of querying relational databases through XML views, the problem of updating relational databases through XML views has not been addressed. In this paper we investigate how a subset of XQuery can be used to build updatable XML views, so that an update to the view can be unambiguously translated to a set of updates on the underlying relational database, assuming that certain key and foreign key constraints hold. In particular, we show how views defined in this subset of XQuery can be mapped to a set of relational views, thus transforming the problem of updating relational databases through XML views into a classical problem of updating relational databases through relational views
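
    The idea of routing an XML-view update through a relational view can be illustrated with a small Python sketch using the standard `sqlite3` and `xml.etree.ElementTree` modules. Everything here is an assumption made for illustration: the `vendor`/`book` schema, the `v_book` view, and the helper names are hypothetical, and SQLite's `INSTEAD OF` trigger merely stands in for the relational view-update translation that the abstract delegates to existing work. It is not the mapping algorithm of the paper.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema: a key on book.isbn and a foreign key to vendor.
SCHEMA = """
CREATE TABLE vendor (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE book (
    isbn TEXT PRIMARY KEY,
    vendor_id INTEGER NOT NULL REFERENCES vendor(id),
    title TEXT NOT NULL,
    price REAL NOT NULL
);
-- Relational view underlying the XML view.
CREATE VIEW v_book AS
    SELECT b.isbn, b.title, b.price, v.id AS vendor_id, v.name AS vendor_name
    FROM book b JOIN vendor v ON v.id = b.vendor_id;
-- Translate price updates on the view into updates on the base table.
CREATE TRIGGER v_book_update INSTEAD OF UPDATE ON v_book
BEGIN
    UPDATE book SET price = NEW.price WHERE isbn = OLD.isbn;
END;
INSERT INTO vendor VALUES (1, 'Acme');
INSERT INTO book VALUES ('111', 1, 'XML Basics', 30.0);
INSERT INTO book VALUES ('222', 1, 'SQL Basics', 25.0);
"""


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def xml_view(conn: sqlite3.Connection) -> ET.Element:
    """Build a nested XML view (vendors containing books) from v_book."""
    root = ET.Element("vendors")
    vendors = {}
    rows = conn.execute(
        "SELECT vendor_id, vendor_name, isbn, title, price "
        "FROM v_book ORDER BY vendor_id, isbn"
    )
    for vendor_id, vendor_name, isbn, title, price in rows:
        vendor = vendors.get(vendor_id)
        if vendor is None:
            vendor = ET.SubElement(root, "vendor", name=vendor_name)
            vendors[vendor_id] = vendor
        book = ET.SubElement(vendor, "book", isbn=isbn)
        ET.SubElement(book, "title").text = title
        ET.SubElement(book, "price").text = str(price)
    return root


def update_book_price(conn: sqlite3.Connection, isbn: str, price: float) -> None:
    """Map 'set <price> of <book isbn=...>' to an update on the relational view."""
    conn.execute("UPDATE v_book SET price = ? WHERE isbn = ?", (price, isbn))
    conn.commit()


if __name__ == "__main__":
    conn = make_db()
    update_book_price(conn, "111", 35.0)
    print(ET.tostring(xml_view(conn), encoding="unicode"))
```

    Because each `<book>` element is identified by the key `isbn`, the update on the view translates to exactly one row of the base table, which is the kind of unambiguity the key and foreign key assumptions are meant to guarantee.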

    Managing Changes in XML Documents (Gerenciando Alterações em Documentos XML)

    XML documents are increasingly present in development projects, and are usually placed in version control systems together with the project's other files. These systems do not take the specific format of XML files into account and treat them as ordinary text files. As a result, automatic merges can silently produce malformed documents, and comparisons can display changes that have no relevance for XML documents, often masking the relevant ones. This article proposes an approach for displaying differences and supporting the merging of XML files, using classical algorithms proposed in the literature. The proposed approach allows parallel changes to XML documents to be made in a more controlled, safe, and efficient way.
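
    A small Python sketch can show why a line-based comparison reports changes that are irrelevant for XML. It compares two documents first as plain text and then after XML canonicalization (C14N). Note that canonicalization is swapped in here purely for illustration: the abstract does not name the classical algorithms it uses, and the helper names `text_diff` and `xml_equivalent` are assumptions. `ET.canonicalize` requires Python 3.8 or later.

```python
import difflib
import xml.etree.ElementTree as ET
from typing import List


def text_diff(a: str, b: str) -> List[str]:
    """Line-based diff, as a format-unaware version control tool would see it."""
    return list(difflib.unified_diff(a.splitlines(), b.splitlines(), lineterm=""))


def xml_equivalent(a: str, b: str) -> bool:
    """Compare canonical forms: attribute order, empty-element syntax and
    surrounding whitespace are normalized away.

    strip_text=True also drops leading/trailing whitespace inside text
    content, which is an assumption that suits data-oriented XML but not
    documents where such whitespace is significant.
    """
    return ET.canonicalize(a, strip_text=True) == ET.canonicalize(b, strip_text=True)


if __name__ == "__main__":
    original = '<project name="demo" version="1"><dep id="x"/></project>'
    # Same content: attributes reordered, indentation added, <dep/> expanded.
    reformatted = (
        '<project version="1" name="demo">\n'
        '  <dep id="x"></dep>\n'
        '</project>'
    )
    # A real change: the version attribute differs.
    changed = '<project name="demo" version="2"><dep id="x"/></project>'

    print("text diff lines:", len(text_diff(original, reformatted)))
    print("equivalent after canonicalization:", xml_equivalent(original, reformatted))
    print("real change detected:", not xml_equivalent(original, changed))
```

    A comparison of this kind only answers whether two documents are equivalent; displaying structured differences and supporting merges, as the article proposes, needs a tree-level algorithm on top.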

    An Improved Algorithm for Generating Database Transactions from Relational Algebra Specifications

    Alloy is a lightweight modeling formalism based on relational algebra. In prior work with Fisler, Giannakopoulos, Krishnamurthi, and Yoo, we have presented a tool, Alchemy, that compiles Alloy specifications into implementations that execute against persistent databases. The foundation of Alchemy is an algorithm for rewriting relational algebra formulas into code for database transactions. In this paper we report on recent progress in improving the robustness and efficiency of this transformation

    An architecture for sharing data and computational storage resources in social P2P networks (Uma arquitetura para compartilhamento de dados e recursos computacionais de armazenamento em redes P2P sociais)

    Managing and transparently accessing data and computational storage resources in highly distributed e-science environments is a difficult problem. One alternative for this problem is the peer-to-peer (P2P) computing model, because it makes it possible to aggregate computational storage resources on demand. P2P networks can thus be used to form the infrastructure of highly distributed computing environments. Since research usually involves collaboration, scientists can form social networks on top of P2P networks to share computational resources and data of common interest. This work therefore proposes an architectural model for sharing computational storage resources and for collaboratively hosting shared data in social P2P networks. The architecture allows computational storage resources to be aggregated on a large scale and, as a consequence, large volumes of shared data to be stored on those resources.

    SARAVÁ: data sharing for online communities in P2P

    This paper describes SARAVÁ, a research project that aims at investigating new challenges in P2P data sharing for online communities. The major advantage of P2P is a completely decentralized approach to data sharing which does not require centralized administration. Users may be numerous and interested in different kinds of collaboration and in sharing their knowledge, ideas, experiences, etc. Data sources can also be numerous, fairly autonomous (i.e., locally owned and controlled), and highly heterogeneous, with different semantics and structures. Our project deals with new, decentralized data management techniques that scale up while addressing the autonomy, dynamic behavior and heterogeneity of both users and data sources. In this context, we focus on two major problems: query processing with uncertain data and management of scientific workflows.

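    From updates over XML views to updates over relational views: applying old solutions to a new problem (De atualizações sobre visões XML para atualizações sobre visões relacionais: aplicando soluções antigas a um novo problema)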

    XML has become an important medium for data exchange, and is frequently used as an interface to (i.e., a view of) a relational database. Although much work has been done on querying relational databases through XML views, the problem of updating relational databases through XML views has not received much attention. In this work, we take the first steps towards solving this problem. Using query trees to capture the notions of selection, projection, nesting, grouping, and heterogeneous sets found in most XML query languages, we show how XML views expressed using query trees can be mapped to a set of corresponding relational views. Thus, we transform the problem of updating relational databases through XML views into a classical problem of updating relational databases through relational views. We then show how updates on the XML view are mapped to updates on the corresponding relational views. Existing work on updating relational views can then be leveraged to determine whether or not the relational views are updatable with respect to the relational updates, and if so, to translate the updates to the underlying relational database. Since query trees are a formal characterization of view definition queries, they are not well suited for end users. We therefore investigate how a subset of XQuery can be used as a top-level language, and show how query trees can be used as an intermediate representation of view definitions expressed in this subset.